Using Argument Diagrams to Improve Critical Thinking Skills in Introductory Philosophy
Abstract
In an experiment involving 139 students in an introductory philosophy course, we tested whether students were improving their ability to think critically about arguments, and whether using argument diagramming as an analysis aid contributed to this improvement. We determined that the students did develop this skill over the course of the semester. We also determined that the students in one section of the course gained significantly more than the students in the other sections, and that this was due almost entirely to their ability to use argument diagrams. We conclude that learning how to construct argument diagrams significantly improves a student's ability to analyze, comprehend, and evaluate arguments.

Using Argument Diagrams to Improve Critical Thinking Skills in Introductory Philosophy

In the introductory philosophy class at Carnegie Mellon University (80-100 What Philosophy Is), one important learning goal is the development of general critical thinking skills. Even though there are a few generally accepted measures of these skills (e.g., the California Critical Thinking Skills Test and the Watson-Glaser Critical Thinking Appraisal; but see also Halpern, 1989 and Paul, Binker, Jensen, & Kreklau, 1990), there is surprisingly little research on the sophistication of, or on effective methods for improving, the critical thinking skills of college students. The research that has been done shows that the population in general has very poor skills (Perkins, Allen, & Hafner, 1983; Kuhn, 1991; Means & Voss, 1996), and that very few courses actually improve these skills (Annis & Annis, 1979; Pascarella, 1989; Stenning, Cox, & Oberlander, 1995). Critical thinking involves the ability to analyze, understand, and evaluate an argument. Our first hypothesis is that students improved on these tasks after taking the introductory philosophy course.
However, we wanted to determine not only whether they improved, but how much improvement could be attributed to alternative teaching methods. One candidate method is the use of argument diagrams as an aid to overall argument comprehension, since we believe that they significantly facilitate understanding, analysis, and evaluation. An argument is a series of statements in which one is the conclusion and the others are premises supporting this conclusion; an argument diagram is a visual representation of these statements and the inferential connections between them. For example, at the end of the Meno, Plato (1976) argues through the character of Socrates that virtue is a gift from the gods (89d-100b). While the English translations of Plato's works are among the more readable philosophical texts, it is still the case not only that the text contains many more sentences than just the propositions that are part of the argument, but also that, proceeding necessarily linearly, the prose obscures the inferential structure of the argument. Thus anyone who wishes to understand and evaluate the argument may reasonably be confused. If, on the other hand, we are able to extract just the statements Plato uses to support his conclusion, and visually represent the connections between these statements (as shown in Figure 1), the structure of the argument is immediately clear, as are the places where we may critique or applaud it. Recent interest in argument visualization (particularly computer-supported argument visualization) has shown that the use of software programs specifically designed to help students construct argument diagrams can significantly improve students' critical thinking abilities over the course of a semester-long college-level course (Kirschner et al., 2003; Twardy, 2004; van Gelder, 2001, 2003). But, of course, one need not have computer software to construct an argument diagram; one needs only a pencil and paper.
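Because an argument diagram is just statements linked by inferential connections, it can be represented as a directed graph in which edges run from premises to the statement they support. The following is a minimal sketch of such a representation (the `ArgumentDiagram` class and its method names are ours, purely illustrative, not part of any diagramming software discussed here), using abridged statements from the Meno argument in Figure 1:

```python
# Illustrative sketch: an argument diagram as a map from each conclusion
# to the list of premises offered in support of it.
from collections import defaultdict

class ArgumentDiagram:
    """Statements plus inferential links from premises to conclusions."""

    def __init__(self):
        # conclusion -> list of premises supporting it
        self.supports = defaultdict(list)

    def add_inference(self, premises, conclusion):
        """Record that `premises` jointly support `conclusion`."""
        self.supports[conclusion].extend(premises)

    def premises_for(self, conclusion):
        """Return the premises recorded as supporting `conclusion`."""
        return self.supports[conclusion]

diagram = ArgumentDiagram()
diagram.add_inference(
    ["Virtue cannot be taught",
     "Something can be taught if and only if it is knowledge"],
    "Virtue is not knowledge")
diagram.add_inference(
    ["Virtue is not knowledge",
     "Virtue is either knowledge or true belief"],
    "Virtue is a true belief")
diagram.add_inference(
    ["Virtue is a true belief",
     "True belief is a gift from the gods"],
    "Virtue is a gift from the gods")
```

Laying the argument out this way makes each inferential step, and hence each place where the argument can be attacked or defended, individually inspectable.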
However, to our knowledge there has been no research done to determine whether it is the mere ability to construct argument diagrams, or the aid of a computer platform and tutor (or possibly both), that is the crucial factor. Our second hypothesis is that the crucial factor in the improvement of critical thinking skills is the ability to construct argument diagrams. This hypothesis posits that students who construct correct diagrams during argument analysis tasks should perform better on these tasks than students who do not.

FIGURE 1. An argument diagram representing one of the arguments in Plato's Meno.

We typically teach several sections of Carnegie Mellon University's introduction to philosophy course (80-100 What Philosophy Is) each semester, with a different instructor for each section. While the general curriculum of the course is set, each instructor is given a great deal of freedom in executing this curriculum. For example, each section is a topics-based course in which epistemology, metaphysics, and ethics are introduced with both historical and contemporary primary-source readings. Each instructor, however, chooses a text, the order of the topics, and the assignments for his or her section. The students who take this course are a mix of classes and majors from all over the University. In the Spring of 2004, students in Section 1 were explicitly taught how to construct argument diagrams to represent a selection of text. In contrast, students in Sections 2, 3, and 4 were not explicitly taught the use of argument diagrams, but rather—if they were taught to analyze arguments at all—were taught to use more traditional kinds of representations (e.g., lists of statements). In this study, we test the first hypothesis by comparing the pretest and posttest scores of all the students in 80-100 in the Spring semester of 2004.
We test the second hypothesis in three ways: (1) by comparing the pretest and posttest scores of students in Section 1 to students in Sections 2, 3, and 4; (2) by comparing the pretest and posttest scores of students who constructed correct argument diagrams on the posttest to those students who did not; and (3) by comparing total scores on individual questions on the posttest of students who constructed the correct argument diagram for that question to those students who did not.

[Figure 1 appears here. Its nodes contain the statements: "Virtue cannot be taught"; "Something can be taught if and only if it is knowledge"; "Something can be taught if and only if it has teachers"; "There are no teachers of virtue"; "Virtue is not knowledge"; "Virtue is either knowledge or true belief"; "Virtue is a true belief"; "Only knowledge and true belief guide correct actions"; "Virtue guides correct actions"; "True belief is a gift from the gods"; "Virtue is a gift from the gods."]

Method

Participants

139 students (46 women, 93 men) across the four sections of introductory philosophy (80-100 What Philosophy Is) at Carnegie Mellon University in the Spring of 2004 were studied. Each section of the course had a different instructor and teaching assistant, and the students chose their section. There were 35 students (13 women, 22 men) in Section 1, 37 students (18 women, 19 men) in Section 2, 32 students (10 women, 22 men) in Section 3, and 35 students (5 women, 30 men) in Section 4. The students in Section 1 were taught the use of argument diagrams to analyze the arguments in the course reading, while the students in the other three sections were taught more traditional methods of analyzing arguments.

Materials

Prior to the semester, the four instructors of 80-100 in the Spring of 2004 met to determine the learning goals of this course, and designed an exam to test the students on relevant skills.
The identified skills were to be able to, when reading an argument: (i) identify the conclusion and the premises; (ii) determine how the premises are supposed to support the conclusion; and (iii) evaluate the argument based on the truth of the premises and how well they support the conclusion. We used this exam as the "pretest" (given in Appendix A) and created a companion "posttest" (given in Appendix B). For each question on the pretest, there was a structurally (nearly) identical question with different content on the posttest. The tests each consisted of 6 questions, each of which asked the student to analyze a short argument. In questions 1 and 2, the student was only asked to state the conclusion (thesis) of the argument. Questions 3-6 each had five parts: (a) state the conclusion (thesis) of the argument; (b) state the premises (reasons) of the argument; (c) indicate (via multiple choice) how the premises are related; (d) provide a visual, graphical, schematic, or outlined representation of the argument; and (e) decide whether the argument is good or bad, and explain this decision.

Procedure

Each of the four sections of 80-100 was a Monday/Wednesday/Friday class. The pretest was given to all students during the second day of class. The students in Sections 2 and 3 were given the posttest on the last day of classes, while the students in Sections 1 and 4 were given the posttest as one part of their final exam, during exam week.

Results and Discussion

Test Coding

Pre- and posttests were paired by student—single-test students were excluded from the sample—so that there were 139 pairs of tests in the study. Tests which did not have pairs were used for coder calibration, prior to the coding of the 139 pairs of tests. Two graduate students independently coded all 278 tests (139 pairs). Each pre-/posttest pair was assigned a unique ID, and the original tests were photocopied (twice, once for each coder) with the identifying information replaced by the ID.
We had an initial grader-calibration session in which the author and the two coders coded several of the unpaired tests, discussed our codes, and came to a consensus about each code. After this, each coder was given the two keys (one for the pretest and one for the posttest) and the tests to be coded in a unique random order.

The codes assigned to each question (or part of a question, except for part (d)) were binary: a code of 1 for a correct answer, and a code of 0 for an incorrect answer. Part (e) of each question was assigned a code of "correct" if the student gave as reasons claims about the truth of the premises and/or the support of the premises for the conclusion. For part (d) of each question, answers were coded according to the type of representation used: correct argument diagram; incorrect or incomplete argument diagram; list; translation into logical symbols, like a proof; Venn diagram; concept map; schematic (e.g., P1 + P2 / Conclusion (C)); other or blank. To determine inter-coder reliability, the Percentage Agreement (PA), Cohen's Kappa (κ), and Krippendorff's Alpha (α) were calculated for each test (given in Table 1).

TABLE 1
Inter-coder Reliability: Percentage Agreement (PA), Cohen's Kappa (κ), and Krippendorff's Alpha (α) for each test

           PA    κ     α
Pretest   .85   .68   .68
Posttest  .85   .55   .54

The inter-coder reliability was fairly good; however, upon closer examination it was determined that one coder had systematically higher standards than the other coder on the questions in which the assignment was open to some interpretation (questions 1 and 2, and parts (a), (b), and (e) of questions 3-6).
Specifically, on the pretest, out of 385 question-parts on which the coders differed, 292 (75%) were cases in which Coder 1 coded the answer as "correct" while Coder 2 coded the answer as "incorrect"; and on the posttest, out of 371 question-parts on which the coders differed, 333 (90%) were cases in which Coder 1 coded the answer as "correct" while Coder 2 coded the answer as "incorrect." In light of this, the codes from the two coders on these questions were averaged, allowing for a more nuanced scoring of each question than either coder alone could give. Since we were interested in how the use of argument diagramming aided the student in answering each part of each question correctly, the code a student received for part (d) of questions 3-6 was preliminarily set aside, while the sum of the codes received on questions 1 and 2, as well as on parts (a), (b), (c), and (e) of questions 3-6, determined the raw score a student received on the test.

TABLE 2
The variables and their descriptions recorded for each student

Variable Name   Variable Description
Pre             Fractional score on the pretest
Post            Fractional score on the posttest
A*              Averaged score (or code) on the pretest for question *
B*              Averaged score (or code) on the posttest for question *
Section         Enrolled section
Sex             Student's sex
Honors          Enrollment in the honors course
Grade           Final grade in the course
Year            Year in school

The primary variables of interest were the fractional pretest and posttest scores (the raw score converted into a percentage), and the individual average scores for each question on the pretest and the posttest. In addition, the following data were recorded for each student: which section the student was enrolled in, the student's final grade in the course, the student's year in school, the student's sex, and whether the student had taken the concurrent honors course associated with the introductory course. Table 2 gives summary descriptions of these variables.
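The two agreement statistics reported in Table 1 are straightforward to compute. The following is a small illustrative sketch (not the authors' analysis scripts; the toy codes are invented) of percentage agreement and Cohen's kappa for two coders' binary codes:

```python
# Percentage agreement: proportion of items coded identically.
# Cohen's kappa: agreement corrected for chance, (p_o - p_e) / (1 - p_e).
from collections import Counter

def percentage_agreement(c1, c2):
    """Proportion of items on which both coders gave the same code."""
    return sum(a == b for a, b in zip(c1, c2)) / len(c1)

def cohens_kappa(c1, c2):
    """Chance-corrected agreement between two coders."""
    n = len(c1)
    p_o = percentage_agreement(c1, c2)
    f1, f2 = Counter(c1), Counter(c2)
    # Expected agreement if each coder assigned codes independently
    # at their observed marginal rates.
    p_e = sum((f1[k] / n) * (f2[k] / n) for k in set(c1) | set(c2))
    return (p_o - p_e) / (1 - p_e)

coder1 = [1, 1, 0, 0, 1, 0, 1, 1]  # invented toy codes
coder2 = [1, 0, 0, 0, 1, 0, 1, 1]
print(percentage_agreement(coder1, coder2))  # 0.875
print(cohens_kappa(coder1, coder2))          # 0.75
```

Because kappa discounts the agreement expected by chance from the coders' marginal code frequencies, it can differ substantially between two tests with identical percentage agreement, as in Table 1.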
Average Gain from Pretest to Posttest for All Students

The first hypothesis was that the students' critical thinking skills improved over the course of the semester. This hypothesis was tested by determining whether the average gain of the students from pretest to posttest was significantly positive. The straight gain, however, may not be fully informative if many students had fractional scores close to 1 on the pretest. Thus, the hypothesis was also tested using the standardized gain: each student's gain as a fraction of what that student could have possibly gained. The mean scores on the pretest and the posttest, as well as the mean gain and standardized gain, for the whole population of students are given in Table 3.

TABLE 3
Mean fractional score (standard deviation) for the pretest and the posttest, mean gain (standard deviation), and mean standardized gain (standard deviation)

                    Pre          Post         Gain         GainSt.
Whole Population    0.59 (0.14)  0.78 (0.12)  0.19 (0.01)  0.43 (0.03)

The difference in the means of the pretest and posttest scores was significant (paired t-test; p < .001). In addition, the mean gain was significantly different from zero (1-sample t-test; p < .001), and the mean standardized gain was significantly different from zero (1-sample t-test; p < .001). From these results we can see that our first hypothesis is confirmed: overall, the students did have significant gains and standardized gains from pretest to posttest.

Comparison of Gains of Students by Section and by Argument Diagram Use

Our second hypothesis was that the students who were able to construct correct argument diagrams would gain the most from pretest to posttest. Since the use of argument diagrams was only explicitly taught in Section 1, we first tested this hypothesis by determining whether the average gain of the students in Section 1 was significantly different from the average gain of the students in each of the other sections.
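The two gain measures used throughout these comparisons can be sketched in a few lines (an illustrative sketch with invented scores, not the study's data): the straight gain is simply post minus pre, while the standardized gain divides that by the improvement still available, 1 - pre.

```python
# Straight gain and standardized gain for fractional test scores in [0, 1].

def gain(pre, post):
    """Straight gain from pretest to posttest."""
    return post - pre

def standardized_gain(pre, post):
    """Fraction of the possible improvement (1 - pre) actually achieved."""
    return (post - pre) / (1 - pre)

pre, post = 0.5, 0.75  # invented example scores
print(gain(pre, post))               # 0.25
print(standardized_gain(pre, post))  # 0.5
```

The standardized gain matters for the comparisons above because a student starting near the ceiling can post only a small straight gain no matter how much the course helped.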
Again, though, the straight gain may not be fully informative if the mean on the pretest was not the same for each section, and if many students had fractional scores close to 1 on the pretest. Thus, we also tested this hypothesis using the standardized gain. The mean scores on the pretest and the posttest, as well as the mean gain and standardized gain, for the sub-populations of students in each section are given in Table 4.

TABLE 4
Mean fractional score (standard deviation) for the pretest and the posttest, mean gain (standard deviation), and mean standardized gain (standard deviation)

            Pre          Post         Gain         GainSt.
Section 1   0.64 (0.14)  0.85 (0.10)  0.21 (0.02)  0.51 (0.07)
Section 2   0.53 (0.16)  0.70 (0.14)  0.17 (0.03)  0.32 (0.05)
Section 3   0.58 (0.14)  0.79 (0.08)  0.21 (0.02)  0.48 (0.04)
Section 4   0.63 (0.10)  0.80 (0.09)  0.17 (0.02)  0.42 (0.05)

Since there was such variability in the scores on the pretest among the different sections, we ran an ANCOVA on each of the variables Post, Gain, and GainSt, with the variable Pre used as the covariate. This analysis indicates that pretest score was a significant predictor of posttest score (df = 1, F = 24.36, p < .001), gain (df = 1, F = 125.50, p < .001), and standardized gain (df = 1, F = 29.14, p < .001). In addition, this analysis indicates that, even accounting for differences in pretest score, the differences in the posttest scores among the sections were significant (df = 3, F = 8.71, p < .001), as were the differences in the gains (df = 3, F = 8.71, p < .001) and the standardized gains (df = 3, F = 6.84, p < .001). This analysis shows that a student's section is a significant predictor of posttest score, gain, and standardized gain, but it does not tell us how the sections differ. The hypothesis is that the posttest score, gain, and standardized gain for students in Section 1 are significantly higher than in all the other sections.
Thus, we did a planned comparison of the variables Post, Gain, and GainSt for Section 1 against the other sections combined, again using the variable Pre as a covariate. This analysis again indicates that pretest score was a significant predictor of posttest score (df = 1, F = 32.28, p < .001), gain (df = 1, F = 107.37, p < .001), and standardized gain (df = 1, F = 21.42, p < .001). In addition, this analysis indicates that, even accounting for differences in pretest score, the differences in the posttest scores between Section 1 and the other sections were significant (df = 1, F = 11.89, p = .001), as were the differences in the gains (df = 1, F = 11.89, p = .001) and the standardized gains (df = 1, F = 8.07, p = .005), with the average posttest score, gain, and standardized gain being higher in Section 1 than in the other three sections.

Although these differences between sections (at least in standardized gain scores) held, they do not provide a direct test of whether students who constructed correct argument diagrams (regardless of section) have better skills. The reason is that, although the students in Section 1 were the only students to be explicitly taught how to construct argument diagrams, a substantial number of students from the other sections constructed correct argument diagrams on their posttests. In addition, a substantial number of the students in Section 1 constructed incorrect argument diagrams on their posttests. Thus, to test whether it was actually the construction of these diagrams that contributed to the higher scores of the students in Section 1, or whether it was the other teaching methods of the instructor for Section 1, we introduced a new variable into our model. Recall that the type of answer given on part (d) of questions 3-6 was recorded from each test. From these data, a new variable was defined that indicates how many correct argument diagrams a student constructed on the posttest.
This variable is PostAD (value = 0, 1, 2, 3, or 4). The second hypothesis implies that the number of correct argument diagrams a student constructed on the posttest is correlated with the student's pretest score, posttest score, gain, and standardized gain. Since there were very few students who constructed exactly 2 correct argument diagrams on the posttest, and still fewer who constructed exactly 4, we grouped the students by whether they had constructed no correct argument diagrams (PostAD = 0), few correct argument diagrams (PostAD = 1 or 2), or many correct argument diagrams (PostAD = 3 or 4) on the posttest. The results are given in Table 5.

TABLE 5
Mean fractional score (standard deviation) for the pretest and the posttest, mean gain (standard deviation), and mean standardized gain (standard deviation)

               Pre          Post         Gain         GainSt.
No Correct     0.56 (0.16)  0.74 (0.12)  0.18 (0.02)  0.39 (0.03)
Few Correct    0.57 (0.13)  0.75 (0.12)  0.17 (0.02)  0.37 (0.04)
Many Correct   0.66 (0.13)  0.88 (0.06)  0.22 (0.02)  0.56 (0.06)

Again, since there was such variability in the scores on the pretest, we ran an ANCOVA on each of the variables Post, Gain, and GainSt, with the variable Pre used as the covariate. This analysis indicates that pretest score was a significant predictor of posttest score (df = 1, F = 24.68, p < .001), gain (df = 1, F = 132.81, p < .001), and standardized gain (df = 1, F = 30.97, p < .001). This analysis also indicates that, even accounting for differences in pretest score, the differences in posttest scores among the students who constructed no, few, or many correct argument diagrams on the posttest are significant (df = 2, F = 14.66, p < .001), as are the differences in gains (df = 2, F = 14.66, p < .001) and standardized gains (df = 2, F = 11.78, p < .001).
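The PostAD grouping just described is a simple binning; a sketch (the function name is ours, purely illustrative) makes the cut points explicit:

```python
# Map PostAD (number of correct argument diagrams on the posttest, 0-4)
# to the No/Few/Many grouping used in Table 5.

def diagram_group(post_ad):
    if post_ad == 0:
        return "No correct"
    if post_ad in (1, 2):
        return "Few correct"
    if post_ad in (3, 4):
        return "Many correct"
    raise ValueError("PostAD must be between 0 and 4")

print([diagram_group(n) for n in range(5)])
# ['No correct', 'Few correct', 'Few correct', 'Many correct', 'Many correct']
```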
This analysis shows that whether a student constructed no, few, or many correct argument diagrams is a significant predictor of posttest score, gain, and standardized gain, but it does not tell us how the groups differ. The hypothesis is that the posttest score, gain, and standardized gain for students who constructed many diagrams are significantly different from those of both other groups. Thus, we did a planned comparison of the variables Post, Gain, and GainSt for the Many Correct group against the other two groups combined, again using the variable Pre as a covariate. This analysis again indicates that pretest score was a significant predictor of posttest score (df = 1, F = 23.67, p < .001), gain (df = 1, F = 132.00, p < .001), and standardized gain (df = 1, F = 31.29, p < .001). In addition, this analysis indicates that, even accounting for differences in pretest score, the differences in the posttest scores between students who constructed many correct argument diagrams and the other groups were significant (df = 1, F = 28.13, p < .001), as were the differences in the gains (df = 1, F = 28.13, p < .001) and the standardized gains (df = 1, F = 22.27, p < .001), with the average posttest score, gain, and standardized gain being higher for those who constructed many correct argument diagrams than for those who did not.

These results show that the students who mastered the use of argument diagrams—those who constructed 3 or 4 correct argument diagrams—had the highest posttest scores and gained the most as a fraction of the gain that was possible. Interestingly, those students who constructed few correct argument diagrams were roughly equal on all measures to those who constructed no correct argument diagrams.
This may be explained by the fact that nearly all (85%) of the students who constructed few correct argument diagrams, and all (100%) of the students who constructed no correct argument diagrams, were enrolled in the sections in which constructing argument diagrams was not explicitly taught; thus the majority of the students who constructed few correct argument diagrams may have done so by accident. This suggests future work to determine how much the mere ability to construct argument diagrams aids critical thinking skills, compared to that ability combined with instruction on how to read, interpret, and use argument diagrams.

Prediction of Score on Individual Questions

The hypothesis that students who constructed correct argument diagrams improved their critical thinking skills the most was also tested on an even finer-grained scale, by looking at the effect of constructing the correct argument diagram on a particular posttest question on the student's ability to answer the other parts of that question correctly. The hypothesis posits that the score a student received on each part of each question, as well as whether the student answered all the parts of each question correctly, is positively correlated with whether the student constructed the correct argument diagram for that question. To test this, a new set of variables was defined for each of questions 3-6 with value 1 if the student constructed the correct argument diagram on part (d) of the question, and 0 if the student constructed an incorrect argument diagram, or no argument diagram at all. In addition, another new set of variables was defined for each of questions 3-6 with value 1 if the student received codes of 1 for every part (a, b, c, and e), and 0 if the student did not.
The histograms showing the correlations between constructing the correct argument diagram and answering all parts of each question correctly are given in Figure 2.

[Figure 2 appears here: Completely Correct Answer to Question Given Presence/Absence of Correct Argument Diagram.]